SimPack: A Generic Java Library for Similarity Measures in Ontologies

Authors

  • Abraham Bernstein
  • Esther Kaufmann
  • Christoph Kiefer
  • Christoph Bürki
Abstract

Good similarity measures are central to techniques such as retrieval, matchmaking, clustering, data mining, ontology translation, automatic database schema matching, and simple object comparison. Measures for use with complex (or aggregated) objects in ontologies are, however, rare, even though they are central to Semantic Web applications. This paper first introduces SimPack, a library of similarity measures for use in ontologies (of complex objects). The measures of the library are then experimentally compared with a similarity “gold standard” established by surveying 94 human subjects in two ontologies. Results show that human and algorithm assessments vary (both between people and across ontologies), but can be grouped into cohesive clusters, each of which is well modeled by one of the measures in the library. Furthermore, we show two increasingly accurate methods to predict the cluster membership of the subjects, providing the foundation for the construction of personalized similarity measures.

Paper Type: Working Paper

Similar resources

Detecting Similarities in Ontologies with the SOQA-SimPack Toolkit

Ontologies are increasingly used to represent the intended real-world semantics of data and services in information systems. Unfortunately, different databases often do not relate to the same ontologies when describing their semantics. Consequently, it is desirable to have information about the similarity between ontology concepts for ontology alignment and integration. This paper presents the ...

The semantic measures library and toolkit: fast computation of semantic similarity and relatedness using biomedical ontologies

The semantic measures library and toolkit are robust, open-source, and easy-to-use software solutions dedicated to semantic measures. They can be used for large-scale computations and analyses of semantic similarities between terms/concepts defined in terminologies and ontologies. The comparison of entities (e.g. genes) annotated by concepts is also supported. A large collection of mea...

Measuring similarity in ontologies: a new family of measures

Without a doubt, similarity measurement is important for numerous applications (e.g., information retrieval, clustering, ontology matching). Several attempts have already been made to develop similarity measures for ontologies. We noticed that some existing similarity measures are ad hoc and unprincipled. In addition, there is still a need for similarity measures which are applicable to express...

A history of Floral diversity (pollen, spores and algal) during the latest Holocene in the Bandung basin based on palynological analysis in Cihideung, West Java, Indonesia

Floral diversity is a measure of the number of flora types in an area, and reflects how vegetation develops in response to environmental conditions during a certain time interval. The present study aims to examine changes in the diversity of vegetation (pollen, spores and algae), evenness, and similarity in the Bandung Basin through a core of 240 cm depth using a ground drill, as well as the ...

An Extendable Java Framework for Instance Similarities in Ontologies

We present the conceptual basis and a prototypical implementation of a software framework for syntactical and semantical similarities between ontology instances. Our focus comprises both the implementation of specific, ontology-based similarity measures and their flexible, efficient, and extensible combination.

Publication date: 2005